Ranking Sentences for Keyphrase Extraction: A Relational Data Mining Approach

نویسندگان

  • Michelangelo Ceci
  • Corrado Loglisci
  • Lucrezia Macchia
چکیده

Document summarization involves reducing a text document into a short set of phrases or sentences that convey the main meaning of the text. In digital libraries, summaries can be used as concise descriptions which the user can read for a rapid comprehension of the retrieved documents. Most of the existing approaches rely on the classification algorithms which tend to generate “crisp” summaries, where the phrases are considered equally relevant and no information on their degree of importance or factor of significance is provided. Motivated by this, we present a probabilistic relational data mining method to model preference relations on sentences of document images. Preference relations are then used to rank the sentences which will form the final summary. We empirically evaluate the method on real document images. c © 2014 The Authors. Published by Elsevier B.V. Peer-review under responsibility of the Scientific Committee of IRCDL 2014.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

WordTopic-MultiRank: A New Method for Automatic Keyphrase Extraction

Automatic keyphrase extraction aims to pick out a set of terms as a representation of a document without manual assignment efforts. Supervised and unsupervised graph-based ranking methods have been studied for this task. However, previous methods usually computed importance scores of words under the assumption of single relation between words. In this work, we propose WordTopic-MultiRank as a n...

متن کامل

KERT: Automatic Extraction and Ranking of Topical Keyphrases from Content-Representative Document Titles

We introduce KERT (Keyphrase Extraction and Ranking by Topic), a framework for topical keyphrase generation and ranking. By shifting from the unigram-centric traditional methods of unsupervised keyphrase extraction to a phrase-centric approach, we are able to directly compare and rank phrases of different lengths. We construct a topical keyphrase ranking function which implements the four crite...

متن کامل

CollabRank: Towards a Collaborative Approach to Single-Document Keyphrase Extraction

Previous methods usually conduct the keyphrase extraction task for single documents separately without interactions for each document, under the assumption that the documents are considered independent of each other. This paper proposes a novel approach named CollabRank to collaborative single-document keyphrase extraction by making use of mutual influences of multiple documents within a cluste...

متن کامل

Topical Keyphrase Extraction from Twitter

Summarizing and analyzing Twitter content is an important and challenging task. In this paper, we propose to extract topical keyphrases as one way to summarize Twitter. We propose a context-sensitive topical PageRank method for keyword ranking and a probabilistic scoring function that considers both relevance and interestingness of keyphrases for keyphrase ranking. We evaluate our proposed meth...

متن کامل

Opinion Expression Mining by Exploiting Keyphrase Extraction

In this paper, we shall introduce a system for extracting the keyphrases for the reason of authors’ opinion from product reviews. The datasets for two fairly different product review domains related to movies and mobile phones were constructed semiautomatically based on the pros and cons entered by the authors. The system illustrates that the classic supervised keyphrase extraction approach – m...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014